Segmentation  of  Scenes  in  Exploratory  Mode 

Final  Report 

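PI Name: Ruzena Bajcsy
Co-PI Name: Max Mintz
PI Institution: Department of Computer and Information Science, University of Pennsylvania
PI Phone Number: (215) 898-0370
PI Fax Number: (215) 573-2048
PI E-mail Address: bajcsy@central.cis.upenn.edu
Grant Title: Segmentation of Scenes in Exploratory Mode
Grant/Contract Number: N00014-88-K-0630
R&T Number: 4331742-01
Reporting Period: 01 Oct 91 - 30 Sep 92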

1  Objective 

GRASP Laboratory research combines Active Perception and Robotics to produce intelligent
devices capable of performing sophisticated tasks. This research specifically concentrates on
multispectral image processing, 3-D shape identification, decision making and robot actuation.
Perception via manipulation is combined with information obtained from a variety of sensors to
establish one or more features or properties of an unstructured environment. This links exploration
of an unknown environment by visual sensing, range measurement, manipulation and physical probing.
It is a direct application of our theoretical work in robust multisensor fusion and techniques for
integrating data from multiple modalities.

One of the primary objectives of this research is to investigate coordination and communication
of multi-agent systems. In particular, multiple agents explore and adapt to their surroundings
and organize and configure themselves to perform required tasks with the possible assistance of
human agents.


2  Approach 


Our approach is based on an “advice-based small-team” architecture. The agents are heterogeneous
in both their scope of applicability or functionality and their capabilities or competence. Their
connectivity depends on their shared versus independent domain of applicability and/or task/subtasks.
An example of a shared domain is an obstacle monitored by vision and acoustic sensors. These
two agents perform redundant or complementary functions. On the other hand, an example of
independent agents would be the force sensors that monitor the contact and sliding/rolling of an
object held by two palms, while the acoustic sensors monitor the vehicles to avoid obstacles. The
advice-based small-team architecture is new in that it provides as much autonomy as possible to
individual agents, yet it makes all the possible information accessible to other agents in the spirit
of cooperation. All the agents know the common task of transporting an object from place A to
place B. Since all the agents are physical, the real-time issue becomes apparent.


Distribution Statement A: Approved for public release; distribution unlimited.


3  Progress 

Progress  in  the  last  year  of  the  project  has  been  made  in  the  areas  of  control,  the  observer  agent, 
and  multisensor  fusion. 


3.1  Control 

The control of individual mobile manipulators has been investigated. The control of a mobile
manipulator involves the coordination of locomotion of the mobile platform and manipulation of the
manipulator. The coordination of locomotion and manipulation is important for a number of
reasons including redundancy in mobility, difference in dynamic response, nonholonomic constraints,
and dynamic interactions. Modeling the mobile platform as a nonholonomic dynamic system, we
have developed and experimentally tested a control algorithm for coordinating locomotion and
manipulation. Using this algorithm, while the manipulator is dragged by an operator in any direction
in a horizontal plane, the mobile platform is able to bring the manipulator into the configuration
with the maximum manipulability measure.
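
To make the notion of a configuration with maximum manipulability concrete, the following sketch assumes Yoshikawa's manipulability measure, w = sqrt(det(J J^T)), applied to a two-link planar arm, for which w reduces to l1 * l2 * |sin(theta2)|. The report does not state which measure or arm geometry was used, so the unit link lengths, the function names, and the grid search are illustrative assumptions only.

```python
import math

def jacobian(theta1, theta2, l1=1.0, l2=1.0):
    """Position Jacobian of a two-link planar arm (assumed geometry)."""
    s1, c1 = math.sin(theta1), math.cos(theta1)
    s12, c12 = math.sin(theta1 + theta2), math.cos(theta1 + theta2)
    return [[-l1 * s1 - l2 * s12, -l2 * s12],
            [l1 * c1 + l2 * c12, l2 * c12]]

def manipulability(theta1, theta2, l1=1.0, l2=1.0):
    """Yoshikawa measure; for a square Jacobian sqrt(det(J J^T)) equals |det J|."""
    (a, b), (c, d) = jacobian(theta1, theta2, l1, l2)
    return abs(a * d - b * c)

def best_elbow_angle(steps=3600):
    """Hypothetical helper: grid search over theta2 in [0, pi] for the maximum."""
    candidates = [math.pi * k / steps for k in range(steps + 1)]
    return max(candidates, key=lambda t2: manipulability(0.0, t2))
```

Under these assumptions the measure is independent of the first joint angle and peaks when the elbow is bent at a right angle, which is the kind of preferred configuration a platform controller could try to restore while the arm is being dragged.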

In  further  research  we  will  integrate  the  wrist  force/torque  sensor  in  the  control  algorithm,  thus 
enabling  the  mobile  manipulator  to  maintain  contact  with  and  follow  a  moving  surface  rather  than 
being  dragged. 




3.2  Observer  Agent 




The function of the Observer Agent is to recognize the environment (in real time), in particular the
free path and the obstacles. While there are numerous algorithms in the literature describing how to
extract optical flow, range and motion parameters, most of them are too complex to run in real time
(15 frames per second) and are not robust enough. For real-time processing we have concentrated
on proper data reduction mechanisms (data selection) via the use of different optics and
model-based tracking. For the robustness question, we have concentrated on removing highlights
and shadows using color and active light. These algorithms are based on point transformations
(difference of two images), and hence are highly parallel.
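
A minimal sketch of such a point transformation follows, assuming two registered gray-level frames of the same scene, one taken under ambient light only and one with an added active light source. The function names, the sample pixel values, and the threshold of 30 gray levels are illustrative assumptions and are not taken from the project's implementation.

```python
def difference_image(image_a, image_b):
    """Per-pixel absolute difference of two equally sized gray-level images."""
    return [[abs(a - b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(image_a, image_b)]

def changed_mask(image_a, image_b, threshold=30):
    """Mark pixels whose brightness changes by more than `threshold` (assumed value)."""
    return [[d > threshold for d in row]
            for row in difference_image(image_a, image_b)]

# Hypothetical 2x3 frames: ambient light only, then ambient plus active light.
ambient = [[20, 20, 200],
           [20, 180, 200]]
active = [[25, 120, 205],
          [22, 185, 203]]
mask = changed_mask(ambient, active)
```

Each output pixel depends only on the two input pixels at the same location, so all pixels could be evaluated independently, which is what makes this class of operation easy to parallelize.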

3.3  Multisensor  Fusion 

Our  multi-agent  system  employs  the  following  sensors:  multiple  cameras  which  simultaneously 
provide  images  from  multiple  fields  of  view  and  varying  depths  of  field;  digital  compasses  on  the 
mobile  agents;  odometry  from  the  wheels  of  the  mobile  agents;  acoustic  range  sensors  on  the  mobile 
agents; infrared proximity sensors on the mobile agents; and force and torque sensors on the
end-effectors of manipulator agents. These sensors provide information of different types and qualities.

One research issue is the delineation of decision models for combining or fusing information of a
common type with differing qualities, as well as the fusion of information of different types with
varying quality. We have developed a mathematical model for fusing information of a common type
where one source provides coarse-grained information with good reliability and the other source
provides fine-grained information but may be subject to serious sporadic errors. For example, we
can use optics or infrared technology for coarse range determination. Within specified domains of
operation, these coarse range estimates will be reliable; we can use acoustic range sensing for
fine-grained range information, with the caveat that the acoustic range information may be seriously
in error due to multipath or insufficient target cross-section. These models and techniques do not
rely on either highly refined sensor noise models or highly accurate sensor position information.
Both of these additional sources of errors are accounted for in this methodology.
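
The coarse/fine idea can be illustrated with a simple consistency gate. This is not the decision-theoretic model developed in the project. It is only a sketch, assuming the coarse sensor yields a reliable interval and the fine sensor yields a point reading that is either accurate or grossly wrong. The function name, the `margin` parameter, and the midpoint fallback are hypothetical choices.

```python
def fuse_range(coarse_lo, coarse_hi, fine_reading, margin=0.0):
    """Return (estimate, source) for one range measurement.

    The fine-grained reading is accepted only when it is consistent with the
    reliable coarse interval; otherwise the interval midpoint is used instead.
    """
    if coarse_lo > coarse_hi:
        raise ValueError("coarse interval is empty")
    if coarse_lo - margin <= fine_reading <= coarse_hi + margin:
        return fine_reading, "fine"
    # Assumed fallback: the fine reading is treated as a sporadic gross error.
    return (coarse_lo + coarse_hi) / 2.0, "coarse"
```

For instance, with a coarse interval of 2.0 to 3.0 m, an acoustic reading of 2.37 m would be kept, while a multipath reading of 7.9 m would be rejected in favor of the coarse midpoint.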


4  Accomplishments 

Accomplishments  in  the  last  year  of  the  project  include: 

• Recognition of highlights for dielectric materials and metallic materials.

•  Recognition  of  shadows  using  active  light. 

•  Simultaneous  real-time  (15  frames  per  second)  model-based  2D  tracking  of  multiple  objects. 

•  Near-optimal  robust  fixed-size  confidence  procedures. 

•  Robustness  with  respect  to  noise  distribution  uncertainty,  applicable  to  essentially  any  class 
of  noise  distributions  which  have  smooth  (non-atomic)  boundaries. 

•  Near-optimal  performance  obtained  using  easily  computed  non-monotone  functions. 


5  Technical  Reports 

1.  Tarek  M.  Sobh  and  Ruzena  Bajcsy.  Visual  Observation  Of  A  Moving  Agent.  Technical 
Report,  Department  of  Computer  and  Information  Science,  University  of  Pennsylvania,  MS- 
CIS-91-86,  GRASP  LAB  283. 

2.  Marcos  Salganicoff  and  Ruzena  Bajcsy.  Sensorimotor  Learning  Using  Active  Perception  In 
Continuous  Domains.  Technical  Report,  Department  of  Computer  and  Information  Science, 
University  of  Pennsylvania,  MS-CIS-91-87,  GRASP  LAB  284. 

3.  Eric  Paljug,  Tom  Sugar,  Vijay  Kumar  and  Xiaoping  Yun.  Important  Considerations  In  Force 
Control  With  Applications  To  Multi-Arm  Manipulation.  Technical  Report,  Department  of 
Computer  and  Information  Science,  University  of  Pennsylvania,  MS-CIS-91-88,  GRASP  LAB 
287. 

4.  Sanjay  Agrawal.  Robotic  Manipulation  Using  A  Behavioral  Framework.  Technical  Report 
(Dissertation),  Department  of  Computer  and  Information  Science,  University  of  Pennsylvania, 
MS-CIS-91-90,  GRASP  LAB  287. 

5.  Ruzena  Bajcsy  and  Mario  Campos.  Active  and  Exploratory  Perception.  Technical  Report, 
Department  of  Computer  and  Information  Science,  University  of  Pennsylvania,  MS-CIS-91- 
91,  GRASP  LAB  288. 

6. Ruzena Bajcsy. An Active Observer. Technical Report, Department of Computer and
Information Science, University of Pennsylvania, MS-CIS-91-95, GRASP LAB 295.

7.  Tarek  Sobh.  Active  Observer:  A  Discrete  Event  Dynamic  System  Model  For  Controlling  An 
Observer  Under  Uncertainty.  Technical  Report  (Dissertation),  Department  of  Computer  and 
Information  Science,  University  of  Pennsylvania,  MS-CIS-91-99,  GRASP  LAB  296. 




8.  Gareth  D.  Funka-Lea.  Vision  For  Navigation  Using  Two  Road  Cues.  Technical  Report, 
Department  of  Computer  and  Information  Science,  University  of  Pennsylvania,  MS-CIS-91- 
100,  GRASP  LAB  297. 

9. Thomas Lindsay. Teleprogramming: Remote Site Research Issues. Technical Report,
Department of Computer and Information Science, University of Pennsylvania, MS-CIS-92-01,
GRASP LAB 298.

10.  John  Bradley.  Interactive  Image  Display  For  The  X  Window  System.  Technical  Report, 
Department  of  Computer  and  Information  Science,  University  of  Pennsylvania,  MS-CIS-92- 
04,  GRASP  LAB  299. 

11. Luca Bogoni. Superquadric Library, User Manual and Utility Programs. Technical Report,
Department of Computer and Information Science, University of Pennsylvania, MS-CIS-92-11,
GRASP LAB 300.

12. Pramath Raj Sinha. Robotic Exploration Of Surfaces and Its Application To Legged
Locomotion. Technical Report, Department of Computer and Information Science, University of
Pennsylvania, MS-CIS-92-12, GRASP LAB 301.

13.  Sang  Wook  Lee.  Understanding  Of  Surface  Reflections  In  Computer  Vision  By  Color  and 
Multiple  Views.  Technical  Report  (Dissertation),  Department  of  Computer  and  Information 
Science,  University  of  Pennsylvania,  MS-CIS-92-13,  GRASP  LAB  301. 

14.  Faculty  &  Graduate  Students.  Grasp  Laboratory  News,  Volume  8,  Number  1.  Technical 
Report,  Department  of  Computer  and  Information  Science,  University  of  Pennsylvania,  MS- 
CIS-92-15,  GRASP  LAB  302. 

15. Yin-Tien Wang and Vijay Kumar. Simulation Of Mechanical Systems With Multiple
Frictional Contacts. Technical Report, Department of Computer and Information Science,
University of Pennsylvania, MS-CIS-92-16, GRASP LAB 303.

16.  Yoshio  Yamamoto  and  Xiaoping  Yun.  Coordinating  Locomotion  and  Manipulation  Of  A 
Mobile  Manipulator.  Technical  Report,  Department  of  Computer  and  Information  Science, 
University  of  Pennsylvania,  MS-CIS-92-18,  GRASP  LAB  304. 

17.  Eric  D.  Paljug.  Multi-Arm  Manipulation  Of  Large  Objects  With  Rolling  Contacts.  Technical 
Report,  Department  of  Computer  and  Information  Science,  University  of  Pennsylvania,  MS- 
CIS-92-19,  GRASP  LAB  305. 

18.  Insup  Lee.  Proving  Properties  of  Real-Time  Distributed  Systems:  A  Comparison  of  Three 
Approaches.  Technical  Report,  Department  of  Computer  and  Information  Science,  University 
of  Pennsylvania,  MS-CIS-92-20,  GRASP  LAB  306. 

19. Robert Bruce King, II. Design, Implementation, and Evaluation Of A Real-Time Kernel
For Distributed Robotics. Technical Report (Dissertation), Department of Computer and
Information Science, University of Pennsylvania, MS-CIS-92-26, GRASP LAB 307.

20. Marcos Salganicoff. A Robotic System for Learning Visually-Driven Grasp Planning.
Technical Report, Department of Computer and Information Science, University of Pennsylvania,
MS-CIS-92-27, GRASP LAB 308.




21. Gerda L. Kamberova. Robust Location Estimation for MLR and Non-MLR Distributions.
Technical Report, Department of Computer and Information Science, University of
Pennsylvania, MS-CIS-92-28, GRASP LAB 309.

22. Gerda L. Kamberova. Markov Random Field Models: A Bayesian Approach To Computer
Vision Problems. Technical Report, Department of Computer and Information Science,
University of Pennsylvania, MS-CIS-92-29, GRASP LAB 310.

23.  Robert  Mandelbaum.  Convergence  of  Stochastic  Processes.  Technical  Report,  Department 
of  Computer  and  Information  Science,  University  of  Pennsylvania,  MS-CIS-92-30,  GRASP 
LAB  311. 

24. Gerda Kamberova, Ray McKendall and Max Mintz. Multivariate Data Fusion Based On
Fixed-Geometry Confidence Sets. Technical Report, Department of Computer and
Information Science, University of Pennsylvania, MS-CIS-92-31, GRASP LAB 312.

25. Jana Kosecka. Control of Discrete Event Systems. Technical Report, Department of
Computer and Information Science, University of Pennsylvania, MS-CIS-92-35, GRASP LAB 313.

26.  Luca  Bogoni  and  Ruzena  Bajcsy.  An  Active  Approach  To  Functionality  Characterization  and 
Recognition.  Technical  Report,  Department  of  Computer  and  Information  Science,  University 
of  Pennsylvania,  MS-CIS-92-37,  GRASP  LAB  315. 

27. Mario Fernando Montenegro Campos. Robotic Exploration Of Material and Kinematic
Properties Of Objects. Technical Report (Dissertation), Department of Computer and Information
Science, University of Pennsylvania, MS-CIS-92-38, GRASP LAB 316.

28. Ron Katriel. Parallel Evidence-Based Indexing of Complex Three-Dimensional Models Using
Prototypical Parts and Relations. Technical Report, Department of Computer and
Information Science, University of Pennsylvania, MS-CIS-92-39, GRASP LAB 317.


6  Publications 


1. Ruzena Bajcsy. An Active Observer. Proceedings of the ARPA Image Understanding
Workshop, pages 137-147, San Diego, CA, January 1992.

2. Mario Campos, Vijay Kumar and Ruzena Bajcsy. Kinematic identification of linkages.
Proceedings of the 3rd International Conference on Advances in Robot Kinematics, Ferrara, Italy,
September 7-9, 1992.

3.  Janez  Funda,  Thomas  Lindsay  and  Richard  P.  Paul.  Teleprogramming:  towards  delay- 
invariant  remote  manipulation.  Presence:  Teleoperators  and  Virtual  Environments,  Volume 
1,  Number  1;  MIT  Press,  January  1992. 

4.  Gareth  Funka-Lea  and  Ruzena  Bajcsy.  Vision  for  vehicle  guidance  using  two  road  cues. 
Proceedings  of  the  Intelligent  Vehicles  ’92  Symposium,  pages  126-131,  Detroit,  Michigan, 
June  1992. 

5. Alok Gupta, Gareth Funka-Lea and Kwangyoen Wohn. Segmentation, modeling and
classification of the compact objects in a pile. Hatem Nasr, editor, Selected Papers on Automatic
Object Recognition, SPIE Milestone Series, 1991.




6. Gerda Kamberova, Ray McKendall and Max Mintz. Multivariate Data Fusion Based on
Fixed-Geometry Confidence Sets. SPIE Proc. of the International Symposium on Advances
in Intelligent Systems, Session on Sensor Fusion, November 1991.

7. V. Koivunen and M. Pietikainen. Evaluating quality of surface description using robust
methods. 11th International Conference on Pattern Recognition, The Hague, Netherlands,
pp. 214-218, 1992.

8. V. Koivunen and M. Pietikainen. Experiments with combined edge and region-based range
image segmentation. Theory and Applications of Image Analysis, World Scientific
Publications, pp. 162-176, 1992.

9. Vijay Kumar. Characterization of workspaces of parallel manipulators. ASME Journal of
Mechanical Design, 114(3):368-375, 1992.

10.  Vijay  Kumar.  A  compact  inverse  velocity  solution  for  redundant  robots.  In  Proceedings  of 
1992  International  Conference  on  Robotics  and  Automation,  pages  482-487,  Nice,  France, 
May  1992. 

11.  V.  Kumar,  T.  G.  Sugar  and  G.  Pfreundschuh.  A  three  degree  of  freedom  in-parallel  actuated 
manipulator.  Proceedings  of  the  9th  CISM-IFToMM  Symposium  on  Theory  and  Practice  of 
Manipulators,  Udine,  Italy,  September  1-4,  1992. 

12.  Sang  Wook  Lee  and  Ruzena  Bajcsy.  Detection  of  specularity  using  color  and  multiple  views. 
Image  and  Vision  Computing,  10:643-653,  1992. 

13. Sang Wook Lee and Ruzena Bajcsy. Detection of Specularity Using Color and Multiple Views.
Proc. of Second European Conference on Computer Vision, Santa Margherita Ligure, Italy,
1992. Outstanding Paper Award.

14.  Ales  Leonardis  and  Ruzena  Bajcsy.  Finding  parametric  curves  in  an  image.  Proc.  of  Second 
European  Conference  on  Computer  Vision,  Santa  Margherita  Ligure,  Italy,  1992. 

15. Jasna Maver and Ruzena Bajcsy. Occlusions and the Next View Planning. IEEE Int. Conf.
on Robotics and Automation, May 1992.

16.  Ray  McKendall  and  Max  Mintz.  Robust  Sensor  Fusion  with  Statistical  Decision  Theory. 
Data  Fusion  in  Robotics  and  Machine  Intelligence,  M.A.  Abidi  and  R.C.  Gonzalez,  editors. 
Academic  Press,  Spring,  1992. 

17. Mohamed Ouerfelli, William Harwin and Vijay Kumar. A pneumatic actuation system for a
wheelchair-mounted robot arm. 15th RESNA International Conference, Toronto, June 6-11,
1992.

18.  Eric  Paljug,  Thomas  Sugar,  Vijay  Kumar  and  Xiaoping  Yun.  Important  considerations  in 
force  control  with  applications  to  multi-arm  manipulation.  IEEE  International  Conference 
on  Robotics  and  Automation,  pp.  1270-1275,  Nice,  France,  May  10-15,  1992. 

19. Richard Paul, Thomas Lindsay and Craig Sayers. Time delay insensitive teleoperation.
Proceedings of the 1992 IEEE/RSJ International Conference on Intelligent Robots and Systems,
pages 247-254, July 1992.




20. Richard Paul, Thomas Lindsay, Craig Sayers and Matt Stein. Time-delay insensitive
virtual-force reflecting teleoperation. Artificial Intelligence, Robotics and Automation in Space,
pages 55-67, Toulouse, France, September 1992.

21. Marcos Salganicoff and Ruzena Bajcsy. Robot sensorimotor learning in continuous domains.
Proceedings of 1992 International Conference on Robotics and Automation, Nice, France, May
1992.

22. Pramath R. Sinha and Ruzena K. Bajcsy. Implementation of an Active Perceptual Scheme
for Legged Locomotion of Robots. Proceedings of the Fourth International Workshop on
Intelligent Robots and Systems (IROS ’91), pages 1518-1523, Osaka, Japan, November 1991.

23. Pramath R. Sinha and Ruzena K. Bajcsy. Robotic Exploration of Surfaces and its Application
to Legged Locomotion. Proceedings of the IEEE International Conference on Robotics and
Automation, Nice, France, May 1992.

24. Tarek M. Sobh and Ruzena Bajcsy. A Model for Observing a Moving Agent. Proceedings
of the Fourth International Workshop on Intelligent Robots and Systems (IROS ’91), Osaka,
Japan, November 1991.

25. Tarek M. Sobh and Ruzena Bajcsy. A Model for Visual Observation Under Uncertainty. 1992
IEEE Symposium on Computer Aided Control System Design (CACSD ’92), March 1992.

26. Tarek M. Sobh and Ruzena Bajcsy. Autonomous Observation Under Uncertainty. IEEE
International Conference on Robotics and Automation, Nice, France, May 1992.

27. Matt Stein and Richard Paul. Kinesthetic replay for error diagnosis in time delayed
teleoperation. SPIE OE/Technology ’92: Telemanipulator Technology, 1992.

28. Chau-Chang Wang, Nilanjan Sarkar and Vijay Kumar. Rate kinematics of mobile
manipulators. Proceedings of the 22nd Biennial ASME Mechanisms Conference, pages 225-232,
Scottsdale, AZ, September 1992.

29. Y. Wang and V. Kumar. Simulation of mechanical systems with unilateral constraints.
Proceedings of the 22nd Biennial ASME Mechanisms Conference, Scottsdale, AZ, September
1992.

30. Y. Wang, V. Kumar, and J. Abel. Dynamics of rigid bodies undergoing multiple frictional
contacts. IEEE International Conference on Robotics and Automation, pp. 2764-2769, Nice,
France, May 10-15, 1992.

31. Yangsheng Xu, Xiaoping Yun and Richard P. Paul. Nonlinear feedback control of robot
manipulator and compliant wrist. Dynamics and Control, (1):325-339, 1991.

32. Xiaoping Yun. Modeling and control of two constrained manipulators. Journal of Intelligent
and Robotic Systems, (4):363-377, 1991.

33. Xiaoping Yun. Nonlinear feedback for force control of manipulators. C.T. Leondes, editor,
Control and Dynamic Systems, pages 259-283, Academic Press, New York, 1991.

34. Xiaoping Yun and Daizhan Cheng. Input-output decoupled linearization of general nonlinear
systems. Transactions of the Institute of Measurement and Control, 13(4):218-224, 1991.




35. Xiaoping Yun and Vijay Kumar. An approach to simultaneous control of trajectory and
interaction forces in dual arm configurations. IEEE Transactions on Robotics and Automation,
7(5):618-625, October 1991.

36. Xiaoping Yun, Vijay Kumar, Nilanjan Sarkar and Eric Paljug. Control of multiple arms
with rolling constraints. 1992 IEEE International Conference on Robotics and Automation,
pages 2193-2198, Nice, France, May 1992.




An  Active  Observer 


Ruzena  Bajcsy* 

GRASP  Laboratory 

Computer  and  Information  Science  Department 
University  of  Pennsylvania 
Philadelphia,  PA  19104 


1  Abstract 

In this paper we present a framework for research into the development of an
Active Observer. The components of such an observer are the low and
intermediate visual processing modules. Some of these modules have been
adapted from the community and some have been investigated in the GRASP
laboratory, most notably modules for the understanding of surface reflections
via color and multiple views and for the segmentation of three-dimensional
images into first or second order surfaces via superquadric/parametric
volumetric models. However, the key problem in Active Observer research is the
control structure of its behavior based on the task and the situation. This
control structure is modeled by a formalism called Discrete Event Dynamic
Systems (DEDS).

2  Introduction 

We are interested in the development of an Active Observer. An Active Observer
is an agent which has capabilities to observe scenes, objects, situations and
deliver the observed information to human, manipulatory, and mobile agents.
Naturally there are more questions than answers. We shall list a few which are
of particular interest to us. What are the components/modules that such an
observer must have? How are these components interconnected, i.e. what is the
architecture of such an agent? Some of the modules correspond to certain
visual cues. We take as a given that our observer has several such cues. In
that case, the subsequent question is how are the results from these cues
integrated? When are they invoked? How is the selection process
conducted/guided? Which cue is employed and when? Finally, what kind of
information/messages is delivered by the observer to other agents?

Towards this end, for the last two years we have concentrated on the
development of theoretical and experimental understanding of some of the
cues/components, some cues’ integration and selection, and control strategies
for observation capability. In particular, in cue development we have tried to
understand surface reflections by color and multiple views. An important
finding of this work, which will be described in detail in Section 2, is that
multiple view points provide useful information for discriminating between
specular and Lambertian reflections both from dielectrics and from metals. In
Section 3, we shall describe a system for the segmentation of a
three-dimensional scene into components that can be modeled by superquadric
parametric fit. This system uses, in cooperation, surface segmentation,
contour segmentation and gross volumetric segmentation in order to arrive at
the proper result. The scenes are of moderate complexity (up to 10 parts), but
no other assumptions are made about objects or their parts. This work points
to the common fact that one module or cue or approach cannot handle the
perceptual variety of the data that the real world, even in moderate
complexity, represents. Multiple cues are necessary and hence a great deal of
thought has to go into the integration policy and control structure. In
Section 4, we present a formal model of an observer agent. This model is based
on the theory of Discrete Event Dynamic Systems (DEDS), which allows us to
unequivocally predict the observation capabilities of an observer. In order
for this to occur, the observer must know the discrete events of the task. So
far this is done by the designer. Finally, in Section 5 we show the recent
development of a CCD chip (the Retina) with space variant resolution. Details
are described in that section.

* Acknowledgement: Navy Grant N00014-88-K-0630; AFOSR Grants 88-0244,
AFOSR 88-0296; Army/DAAL 03-89-C-0031PRI; NSF Grants CISE/CDA 88-22719,
IRI 89-06770; and Dupont Corporation.

3 Understanding of Reflection Properties Using Color and Multiple Views

Recently there has been a growing interest in the detection of specularity in
both basic and applied computer vision research. In general, the detection of
specularities from a single gray-level image is a physically underconstrained
problem, and more information needs to be collected in physically sensible
ways to solve the problem. Successful development of an algorithm for image
data collection and interpretation necessarily depends on physical models that
describe how surfaces appear according to the illumination and reflectance
properties and sensor characteristics. Recently the computer vision field has
increasingly incorporated methodologies derived from physical principles of
image formation and sensing [7]. So far there have been three types of
approaches to solving the problem of specularity detection through the
collection of more images: (1) with different light directions, (2) with
different sensor polarization angles, and (3) with different color sensors.

The photometric-stereo-type approaches consider the specular and Lambertian
reflectance properties for obtaining object shape using more than two light
directions [4] [9] [11]. Since the direction and the degree of the collimation
of the illumination need to be strictly controlled, application of the
approach is restricted to darkroom environments. The polarization method
analyzes the polarization of reflected light and detects specularities from
dielectrics and metals [12]. The polarization approach places some
restrictions on the incident illumination direction with respect to surface
orientation.

The dichromatic model [10] proposed by Shafer has been the key model for the
recent specularity detection algorithms using color [8] [5] [6] [3]. The basic
limitation of the color algorithms is that objects must be only colored
dielectrics to use the dichromatic model. For color image segmentation, it is
usually assumed that object surface reflectance is spatially piecewise uniform
in color and that scene illumination is singly colored. We have previously
developed a color image segmentation algorithm for the separation of diffuse,
as well as sharp, specularities and inter-reflections from Lambertian
reflections [3].

Our recent research has focused on the development of some specularity
detection or separation methods that only require modification of sensors but
not any modification of environments. In other words, they are methods that
are active in modifying sensors but passive in modifying environments. There
are two kinds of modification of environments: relocation and re-orientation
of objects by robot manipulation, and illumination change. The prime example
of the illumination change is the light switching for the
photometric-stereo-type methods. Since illumination lighting needs to be
strictly controlled, the photometric-stereo-type approaches are applicable
only for inspection in dark rooms.

Strict illumination control is not always possible in investigating surface
reflection properties in many general environments. Examples include outdoor
inspection, indoor or outdoor navigation, and exploratory environments. Even
for indoor inspection, a well controlled dark room is not always available.

For general environments without strict illumination control, only sensors are
controllable, and color and polarization can be the possible cues. Another
possibility is to move the observer, which has not been used for investigating
reflection properties in computer vision. The idea of moving the observer was
directly motivated by the concept of active vision [2]. For low-level vision
problems of shape or structure, it has been demonstrated that many ill-posed
problems become well-posed if more information is collected by active sensors
[1]. Although the paradigms for shape or structure based on feature
correspondence cannot be directly applied to the study of reflectance
properties, the idea of a moving observer motivated the investigation of new
principles by physical modeling in obtaining more information.

In this paper, we suggest the use of multiple views
for the detection of specularity by introducing two
algorithms. The first algorithm, called spectral differencing,
uses color information from a small number of multiple
views. The second algorithm is called view sampling.
Using many views of gray-level images collected over a wide
angle, view sampling reconstructs object structure
and detects specularities. An important principle used
by the algorithms is Lambertian consistency, which
is the well-known fact that Lambertian reflection
does not change its brightness or spectral content
with viewing direction, whereas specular reflection
or a mixture of Lambertian and specular reflections
can change.

A  problem  associated  with  the  use  of  multiple  views 
with  color  is  what  kind  of  extra  spectral  information  can 
be obtained by moving a color camera without considering
object geometry. If there is any, it may alleviate
the  limiting  assumptions  imposed  on  the  object  and  illu¬ 
mination  domain  for  the  color  segmentation  approaches, 
and  provide  higher  confidence  in  detecting  specularities. 

The spectral differencing algorithm is based on the
observation that any presence of specular reflections can
be inferred from the difference in the distribution of pixel
colors between two color images. According to Lambertian
consistency, the color distribution of pixels from
only Lambertian reflections should be consistent regardless
of viewpoint. On the other hand, specularities
or a mixture of specular and Lambertian reflections
can change the distribution of pixel colors between two
views.

The spectral differencing algorithm does not require
any assistance from image segmentation or geometrical
manipulation. Since the algorithm does not rely on
segmentation or the dichromatic model, it is applicable
to dielectric objects with nonuniform reflectance
and to metals under multiply colored illumination. Figures
1 and 2 show two dielectric objects with variation
in reflectance and a metallic object with neutral
reflectance color. Two fluorescent light tubes and a tungsten
light bulb are used for illumination and there are
inter-reflections from the walls. MSD(0 ← 1) shows the
regions of new color distribution in view 0 compared to
view 1, and MSD(1 ← 0) the regions of new color
distribution in view 1 compared to view 0. Under multiply
colored and extended illumination, it can be seen that
most of the specularities are detected by spectral
differencing.
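
The comparison of color distributions between two views can be sketched as follows. This is only an illustration, not the authors' implementation: it assumes MSD can be read as a minimum spectral distance (the smallest RGB distance from a pixel in one view to any pixel color in the other view), and the function names, the threshold, and the toy pixel values are all hypothetical.

```python
import math

def min_spectral_distance(pixel, other_colors):
    """Smallest Euclidean RGB distance from `pixel` to any color in the other view."""
    return min(math.dist(pixel, c) for c in other_colors)

def spectral_difference(view_a, view_b, threshold):
    """Indices of pixels in view_a whose color has no close match in view_b.

    Pixel positions are ignored: only the color distributions are compared,
    so no correspondence or segmentation is needed.
    """
    colors_b = set(view_b)
    return [i for i, p in enumerate(view_a)
            if min_spectral_distance(p, colors_b) > threshold]

# Toy data (hypothetical): view 0 contains a white highlight that view 1 lacks.
view0 = [(200, 40, 40), (198, 42, 41), (250, 250, 250)]
view1 = [(199, 41, 40), (201, 39, 42), (200, 40, 41)]

new_in_view0 = spectral_difference(view0, view1, threshold=20.0)
new_in_view1 = spectral_difference(view1, view0, threshold=20.0)
```

Under Lambertian consistency both lists would be empty; here only the highlight pixel in view 0 is flagged. The brute-force search is quadratic in the number of pixels and is meant only to show the principle.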

Another approach we introduce is to obtain reflection
properties using only multiple views without any color
information. With views densely sampled over a wide
angle and with known viewing directions, the view
sampling algorithm reconstructs object structure and
distinguishes specularities from Lambertian reflections. The
view sampling algorithm is applicable to dielectrics and
metals.

If object structure is reconstructed assuming Lambertian
consistency for both Lambertian and specular
reflections, the structure reconstructed from the specular
reflections would not in general represent the real object
surface, while the one reconstructed from the Lambertian
reflections does. By examining the differently
reconstructed object structures from specular and Lambertian
reflections, we can identify the reflection types and the
real object structure.

Figure 1: Spectral differencing

Figure 2: Spectral differencing

We adopted an algorithm for computerized tomography
through photometric modeling for the reconstruction
of object structure. Figure 3 shows the camera control
scheme and Figure 4 (a) shows 4 out of 30 view samples
of a gray dielectric object from different viewpoints.
Figure 4 (c) and (d) show the reconstructed structures
at the cross sections 1 and 2 illustrated in Figure 4 (b),
respectively. As shown in Figure 4 (c) and (d), the
structure reconstructed from specularities at cross section
2 is different from the real object surface reconstructed
from Lambertian reflections.

The future direction of our studies is the integration of
many cues in the light of active vision [2]. Active vision
involves not only the modeling of physical sensing and
data processing for vision modules (local model), but
also the control of the modules (global model). Global
models characterize the overall performance and make
predictions on how the individual modules will interact,
which in turn determines how intermediate results are
combined. It is the global model that analyzes and
combines the information from many visual cues to assign
stable descriptors. For more stable descriptions of
reflection properties in more general environments, it is
desirable to extract extra information from a synergistic
combination of multiple cues. The spectral differencing
algorithm demonstrates the synergy from the combination
of color and multiple views. There is also potential
for extra information from the combination of color,
polarization and multiple views.


Figure  4:  View  sampling 




4  Surface  and  Volumetric  Segmentation 
of  Complex  3-D  Objects  Using 
Parametric  Shape  Models 

The  problem  of  part  definition,  description,  and  decom¬ 
position  is  central  to  shape  recognition  systems.  In  this 
paper  we  present  an  integrated  framework  for  segment¬ 
ing  dense  range  data  of  complex  3-D  scenes  into  their 
constituent  parts  in  terms  of  surface  (bi-quadrics)  and 
volumetric  (superquadrics)  primitives,  without  a  priori 
domain  knowledge  or  stored  models.  Our  objective  is 
to  recover  a  structured  description  of  complex  3-D  ob¬ 
jects,  guided  entirely  by  the  geometric  properties  of  the 
shape models. The resulting decomposition into parts
is  very  useful  for  the  high-level  processes,  which  can  at¬ 
tach  domain  specific  labels  to  the  parts,  and  reason  at  a 
level  where  the  visual  input  is  structured  in  terms  of  ge¬ 
ometric  primitives,  rather  than  cope  with  the  difficulties 
of  low-level  vision  and  a  huge  amount  of  unstructured 
data. 

Since the shapes have to be recovered from raw data,
it is not possible to invoke complex models (models with
hundreds of degrees of freedom) straight away. It is,
however, feasible and perceptually less ambiguous to use
simpler but powerful models that can capture the local
and global properties of the object shapes, and provide a
first approximation to the more complex models. With
computability, simplicity, and the utility of the shape
representation as our major concerns, we use bi-quadrics
and superquadrics as our surface and volumetric models
respectively. We develop SUPERSEG (SUPERquadric
SEGmentation), a control structure to effectively carry
out the decomposition of complex objects in range
images, and address the numerous issues encountered in a
data-driven bottom-up approach [13; 14; 15].

The SUPERSEG system (Figure 5) has five major components:
the bi-quadric surface segmentation module; the
module for extracting surface properties and adjacency
relationships; the superquadric model recovery module;
the residual generation and analysis module; and the
control module for superquadric-based segmentation.

4.1  Surface  Segmentation:  Bi-quadric  Models 

The surface segmentation is performed by a novel
local-to-global iterative regression approach of searching for
the best piecewise description of the data in terms of
bi-quadric models [16; 17]. The model-recovery module
consists of independently extrapolating all the
seed-regions and fitting the model using the least-squares
regression method. The region-growing is controlled by
a compatibility-constraint, whose value depends on the
noise due to sensor and quantization, as well as the
allowed tolerance between the shapes of the model and
the underlying data. Seed-regions are placed in a grid pattern
all over the image, and allowed to grow until they are
either completely grown or rejected by the model-selection
procedure (which maximizes a linear benefit-cost
function). Instead of first growing all the regions and then
invoking the model-selection procedure
(Recover-then-select), the model-recovery and model-selection
processes are dynamically combined (Recover-and-select) to
achieve a computationally feasible and robust method
capable of rejecting outliers and determining its domain
of applicability.

Figure 5: The SUPERSEG system: A framework for
surface and volumetric segmentation. Blocks shown: range
image; preprocessing (uniform scaling); surface segmentation
(bi-quadric surface fitting, search for best description,
recover-and-select); surface description (step (C0)
boundaries, internal (C1) boundaries, surface type/orientation,
surface adjacency graphs); superquadric model (model
recovered for given data); superquadric model evaluation
(apparent contour, goodness-of-fit, distance measure,
contour/surface residuals, over/underestimated regions);
evaluation & integration (residual analysis for part
hypotheses, extrapolation (growth) of part-models, negative
volume/concavity description); control module.
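
The model-selection step in the paragraph above can be illustrated with a small sketch. It is not the procedure of [16; 17]: the report says only that selection maximizes a linear benefit-cost function, so the particular score used here (newly explained points minus weighted fitting error and parameter count), the greedy search, and all names and numbers are assumptions made for illustration.

```python
def benefit(model, covered, k_error=1.0, k_params=1.0):
    """Hypothetical linear benefit-cost score for one candidate region model.

    Benefit: number of data points the model explains that are not yet covered.
    Cost: weighted fitting error plus weighted number of model parameters.
    """
    new_points = model["points"] - covered
    return len(new_points) - k_error * model["error"] - k_params * model["n_params"]

def select_models(candidates):
    """Greedily keep the best-scoring candidate until no candidate has positive score."""
    selected, covered = [], set()
    remaining = list(candidates)
    while remaining:
        best = max(remaining, key=lambda m: benefit(m, covered))
        if benefit(best, covered) <= 0:
            break
        selected.append(best["name"])
        covered |= best["points"]
        remaining.remove(best)
    return selected

# Three overlapping candidate regions grown from different seeds (toy data).
candidates = [
    {"name": "A", "points": set(range(0, 60)), "error": 4.0, "n_params": 6},
    {"name": "B", "points": set(range(50, 100)), "error": 3.0, "n_params": 6},
    {"name": "C", "points": set(range(10, 40)), "error": 1.0, "n_params": 6},
]
chosen = select_models(candidates)
```

Candidate C lies entirely inside A, so once A is selected C explains nothing new and is rejected. In a recover-and-select scheme such a check would run during region growing, so redundant regions are dropped early instead of being grown to completion.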

4.1.1  Refining Surface Segmentation & Extracting Surface Properties

The bi-quadric segmentation achieved by the above
procedure needs refinement before it can be used as an
intermediate segmentation by superquadric-based volume
segmentation. Also, the coefficients of the second-order
surfaces have information about orientation and
surface-type (convex or concave) inherent in them. The
orientation information is tremendously useful in aligning
the major axis of cylindrical superquadric models.
Further, due to the compatibility-constraint, regions
intersecting to form surface normal discontinuities (C1)
overlap in the vicinity of the discontinuity, thereby
localizing it. We developed a systematic method for tracing
the bi-quadric intersection curve, which is used to refine
the segmentation as well as to localize the discontinuities
(edges) and to characterize them as convex or concave.
In addition, a surface adjacency graph (SAG) is
constructed with surface patches as nodes and
discontinuity-type as edges between them. The information extracted
from the bi-quadric patches is used to generate and test
hypotheses by the volumetric segmentation module.
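
As a rough illustration of the data structure described above, a surface adjacency graph can be held as a dictionary keyed by patch, with the discontinuity type stored on each edge. The patch names and the helper functions are hypothetical and are not taken from SUPERSEG.

```python
from collections import defaultdict

def build_sag(discontinuities):
    """Build an undirected surface adjacency graph (SAG).

    `discontinuities` holds (patch_a, patch_b, edge_type) triples, where
    edge_type is "convex" or "concave". Patches are nodes; the discontinuity
    type labels the edge between two adjacent patches.
    """
    graph = defaultdict(dict)
    for a, b, edge_type in discontinuities:
        graph[a][b] = edge_type
        graph[b][a] = edge_type
    return graph

def convex_neighbors(graph, patch):
    """Patches joined to `patch` by a convex edge: candidates for one convex part."""
    return sorted(n for n, t in graph[patch].items() if t == "convex")

sag = build_sag([
    ("top", "side", "convex"),
    ("side", "base", "concave"),
    ("top", "front", "convex"),
])
top_group = convex_neighbors(sag, "top")
```

A volumetric module could use such convex groupings as initial part hypotheses and treat concave edges as likely part boundaries.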

4.2  Superquadrics: Volumetric Part-Models

Superquadric models are convex part-models (except the
bent models) that can be recovered for a given set of
3-D points by minimizing a function based on the modified
implicit inside-outside superquadric function [lb:
15]. This formulation enforces a minimum volume
constraint as well as a surface constraint, but is incapable
of decomposing the data set if no appropriate convex
model can be found in the model vocabulary. Thus, the
superquadric model recovery module is adequate only
for recovering an optimal model (if oriented correctly)
given a data set, but not for segmenting it. To decide
whether a recovered model is adequate for the given data
set, we have developed an exhaustive set of criteria
comprising qualitative and quantitative measures.
Quantitative measures are the normalized global deviation of
the model from the data. The deviation can be the
inside-outside function value, or can be measured along the
direction of the viewpoint (Z-residuals for a range
scanner), or along the direction of the minimum distance of
a point from the model (Euclidean distance). The
qualitative measures are the 'local' residuals characterized
by the clusters of 3-D points that are either inside the
model, or on the model, or outside the model. Both
qualitative and quantitative measures are necessary for
complete evaluation of a recovered model.

Figure 6: The NIST object. Top: The range image and its
bi-quadric surface segmentation. Center: the C1 (surface
normal) edges marked at the overlapping parts of the
surfaces. Following a procedure similar to the intersection
cleaning, the edges are marked as convex or concave and a
surface adjacency graph (SAG) is constructed. Bottom:
The three iterations of the global-to-local procedure to
extract the part-structure.
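
The qualitative inside/on/outside labeling can be sketched with the standard superquadric inside-outside function F, which is below 1 inside the model, equal to 1 on its surface, and above 1 outside. This is the textbook unmodified form for a model centered at the origin and aligned with the axes; the modified function used for recovery in the report is not reproduced here, and the tolerance `tol` and the sample points are assumptions.

```python
def inside_outside(point, a1, a2, a3, e1, e2):
    """Standard superquadric inside-outside function F(x, y, z).

    a1, a2, a3 are the semi-axis sizes; e1, e2 are the shape exponents.
    The point is given in the model-centered coordinate frame.
    """
    x, y, z = (abs(c) for c in point)
    xy = (x / a1) ** (2.0 / e2) + (y / a2) ** (2.0 / e2)
    return xy ** (e2 / e1) + (z / a3) ** (2.0 / e1)

def classify(point, params, tol=0.05):
    """Label a 3-D point relative to the model surface."""
    f = inside_outside(point, *params)
    if f < 1.0 - tol:
        return "inside"
    if f > 1.0 + tol:
        return "outside"
    return "on"

# With e1 = e2 = 1 the superquadric is an ellipsoid with semi-axes 1, 1, 2.
ellipsoid = (1.0, 1.0, 2.0, 1.0, 1.0)
labels = [classify(p, ellipsoid) for p in [(0, 0, 0), (1, 0, 0), (0, 0, 3)]]
```

Clusters of points labeled "outside" correspond to data the model underestimates, and clusters labeled "inside" to regions it overestimates.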

4.3  Volumetric  Segmentation:  The  Control 
Strategy 

Because volumetric models do not have
good surface support (as opposed to bi-quadric models),
they cannot be recovered by following exclusively the
extrapolation method (local-to-global) used for bi-quadrics.
In order to obtain an optimal piecewise-convex volumetric
segmentation, it is necessary to proceed
global-to-local, where data is decomposed only if the global model
is inadequate. This allows controlled residual-driven
decomposition of 3-D data, as well as the introduction of
objective evaluation criteria for an acceptable description.
However, the global-to-local method can be aided by the
bi-quadric segmentation in forming hypotheses about
convex combinations of surfaces. Although such hypotheses
are not true in general (an L shape, for example), they can
significantly reduce the computational overhead when true
for a particular part. Previous researchers have assumed that
a 1-to-1 mapping exists between surface patches and
superquadric models, which is also not true in general. But
it does provide a planarity check for the patches, as well
as the orientation and shape of the individual patches in
3-space.

Thus, a strategy that combines the bi-quadric
information with the global-to-local residual-driven method
is most effective in recursively segmenting the scene to
derive the part-structure [13]. A set of acceptance
criteria based on the quantitative and qualitative measures
provides the objective evaluation of intermediate
descriptions, and decides whether to terminate the procedure,
selectively refine the segmentation, or generate a negative
volume description. The control module generates
hypotheses about superquadric models at clusters of
underestimated data and performs controlled extrapolation
of part-models by shrinking the global model. The
recursive splitting of data results in a hierarchical
part-structure comprising global and local models. The
results of complete processing of the range image of a
machined object (from NIST) are shown in Figure 6.


We have tested the SUPERSEG system on real range
images of scenes of varying complexity, including objects
with occluding parts, and scenes where surface
segmentation is not sufficient to guide the volumetric
segmentation. Some of the applications of our approach include
data reduction, 3-D object recognition, geometric
modeling, automatic model generation, object manipulation,
qualitative vision, and active vision.

5  A  Framework  for  Visual  Observation 

In this work we establish a framework for the general
problem of observation, which may be applied to
different kinds of visual tasks. We define “intelligent”
high-level control mechanisms for the observer in order
to achieve efficiency in recognizing different processes
within a specific dynamic system. The intelligent
observer is able to recognize the visual tasks, understands
the meaning of the scene evolution and successfully
reports on the current visual state. It is obvious that there
is a need for high-level interpretation of actions within
the environment and for guarantees of observation
capabilities and stability within the viewing mechanism.
The framework is a predictable one that satisfies the
following general requirements:

•  Recognizes  visual  tasks  and  events. 

•  Repositions  itself  adaptively  and  intelligently. 

•  Operates  in  real  time. 

•  Asserts  and  reports  on  distinct  and  discrete  visual 
states. 

•  Utilizes the continuous parametric evolution of the
visual system.

•  Accommodates  visual  uncertainties. 

We concentrate on observing a manipulation process
in order to illustrate the ideas and motive behind our
framework. The process of observing a robot hand
manipulating an object is crucial for many robotic and
manufacturing tasks. It is important to know in an
automated manufacturing environment whether the robot
hand is doing the correct sequence of operations on an
object (or more than one object). It may be that
the workspace of the robotic manipulator cannot be
accessed by humans, as in the case of some space
applications or some areas within a nuclear plant, for example.
In such a case, having another robot “look” at the
process is a very good option. Thus, the observation process
can be thought of as a stage in a closed-loop fully
automated system where there are robots that perform the
required manipulation task and some other robots that
observe them and correct their actions when something
goes wrong. Typical manipulation processes include
grasping, pushing, pulling, lifting, squeezing, screwing
and unscrewing. In this work, we address the problem
of observing a single hand manipulating a single object
and recognizing what the hand is doing. No feedback
will be supplied to the manipulating robot to correct its
actions. We divide the problem into three major
components. First, we identify a high-level framework for
the visual states. Next, we define the events that cause
state transitions. Finally, we utilize visual uncertainties
to assert the state of the system.

Figure 7: A Model for a Grasping Task

Figure 8: Propagation of Uncertainty (labels shown: 2-D
data; recovered 3-D uncertainty models; 3-D data)

5.1  State  Space  Modeling 

We use a discrete event dynamic system as a high-level
structuring skeleton to model the visual manipulation
system. Discrete event dynamic systems (DEDS) are
dynamic systems (typically asynchronous) in which state
transitions are triggered by the occurrence of discrete
events in the system. Our formulation uses the
knowledge about the system and the different actions in
order to solve the observer problem in an efficient,
stable and practical way. The model incorporates
different hand/object relationships and the possible errors in
the manipulation actions. It also uses different tracking
mechanisms so that the observer can keep track of the
workspace of the manipulating robot. A framework is
developed for the hand/object interaction over time and
a stabilizing observer is constructed. The construction
process utilizes a task-dependent coarse quantization of
the manipulation actions in order to attain an active,
adaptive and goal-directed sensing mechanism. An
example of a DEDS automaton for a simple grasping task
is shown in Figure 7.
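
A discrete event automaton of this kind can be sketched as a transition table keyed by (state, event). The states and events below are hypothetical placeholders for a grasping task and are not the automaton of Figure 7; uncertainty handling and observer stabilization are omitted.

```python
# Hypothetical states and events for a simple grasp; not taken from Figure 7.
TRANSITIONS = {
    ("idle", "hand_approaches"): "approaching",
    ("approaching", "hand_touches_object"): "contact",
    ("contact", "fingers_close"): "grasped",
    ("approaching", "hand_retreats"): "idle",
    ("contact", "hand_retreats"): "idle",
}

def run_observer(events, state="idle"):
    """Follow event-triggered transitions and return the visited states.

    An event with no transition defined from the current state leaves the
    state unchanged.
    """
    trace = [state]
    for event in events:
        state = TRANSITIONS.get((state, event), state)
        trace.append(state)
    return trace

trace = run_observer(["hand_approaches", "hand_touches_object", "fingers_close"])
```

The state changes only when a discrete event is reported, which is what separates this model from a continuous tracker: the low-level vision modules need only decide which event, if any, has occurred.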

5.2  Event  Identification 

Low-level modules are developed for recognizing the
“events” that cause state transitions within the dynamic
manipulation system. To be able to observe how the
events evolve over time, we must be able to identify how
the hand moves and how the hand/object physical
relationship evolves over time. We use a mix of 2-D and
3-D modules to recover a set of parameters that define
the continuous parametric evolution of the scene under
observation. Three-dimensional evolution of the hand
motion is recovered by tracking a set of features and
two-dimensional cues to the number of objects and their
relative location; two-dimensional motion with respect
to the manipulating hand is recovered in real time. The
recovered events are then used to assert state transitions
within the DEDS automaton. We also recover
uncertainties associated with the visual event recovery and utilize
them for navigating the observer automaton.

5.3  Utilizing  Uncertainties 

This work examines closely the possibilities for errors,
mistakes and uncertainties in the visual manipulation
system, observer construction process and event
identification mechanisms. We divide the problem into a
number of major levels for developing uncertainty models in
the observation process. The propagation of uncertainty
is shown in Figure 8.

The sensor level models deal with the problems in
mapping 3-D features to pixel coordinates and the errors
incurred in that process. We identify these uncertainties
and suggest a framework for modeling them. The next
level is the extraction strategy level, in which we develop
models for the possibility of errors in the low-level image
processing modules used for identifying features that are
to be used in computing the 2-D evolution of the scene
under consideration. In the following level, we utilize the
geometric and mechanical properties of the hand and/or
object to reject unrealistic estimates for 2-D movements
that might have been obtained from the first two
levels. We transform the 2-D uncertainty models into
3-D uncertainty models for the structure and motion of
the entire scene. The next level uses the equations that
govern the 2-D to 3-D relationship to perform the
conversion. We then reject the improbable 3-D uncertainty
models for motion and structure estimates by using the
existing information about the geometric and mechanical
properties of the moving components in the scene. The
highest level is the DEDS formulation with uncertainties,
in which state transitions and event identification are
asserted according to the 3-D models of uncertainty that
were developed in the previous levels, and error recovery
is performed according to the ordering of the recovered
distributions.

Figure 9: Experimental Setting

5.4  Conclusions 

The approach used can be considered as a framework for
a variety of visual tasks, as it lends itself to being a
practical and feasible solution that uses existing information
in a robust and modular fashion. The work examines
closely the possibilities for errors and uncertainties in
the manipulation system, observer construction process
and event identification mechanisms. Ambiguities are
allowed to develop and are resolved after finite time;
recovery mechanisms are devised too. Details of the observer
system can be found in [20; 21; 22; 23]. Theoretical and
experimental aspects of the work support adopting the
framework as a new basis for performing task-oriented
recognition, inspection and observation of visual
phenomena. The experimental setup of the observer and
manipulating robots is shown in Figure 9.

6  Spatio-Variant Sensing

Traditional imaging for robotics vision has relied
almost exclusively on common commercial imagers,
notably television format sensors. Their advantages are
clear: the cameras are inexpensive and readily available,
and the sampling of the data is on a "natural"
cartesian (x,y) grid. These sensors have placed enormous
demands, however, on processing architectures. The
problem is not only that image analysis is an ill-defined task
in the real world, but that we have only very expensive
machines that can begin to process the data.

Over the last seven years an international team, led by
Van der Spiegel at the University of Pennsylvania,
Sandini at DIST in Italy, and Claeys at IMEC in Belgium,
designed, built, and tested a new imaging chip called the
Retina [24]. The new camera serves as the foundation
for a new approach to robotics vision. We shift the focus
at the systems level from gathering better data and
designing machines to analyze it to gathering data for the
computing resources that exist. The result is a prototype
sensor that reduces the computational complexity of the
problem by three orders of magnitude and, if scaled to
commercial cameras, by six orders [25].

The Retina attempts to model the gross
characteristics of the primate visual system in a mathematically
elegant way. The computational savings arise from the
same mechanism the eye uses, namely, to maintain one
area of high resolution on the focal plane and to drop
the resolution elsewhere. The mathematical expression
of this is a log-polar mapping. That mapping
transforms a polar data space, where a point has the polar
coordinates (r, θ), by taking the logarithm of the
expression for the point:

$$ z = r e^{i\theta}, \qquad \ln(z) = \ln(r) + i\theta = u + iv $$

This mapping has the useful property of separating
rotations (changes in θ) from magnifications (changes
in r). If the sensor has a uniform sampling grid in u
(ln(r)), then the spatial grid in r will grow exponentially as
distance from the center grows. This models the growth
of the receptive fields in primate retinas.
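
The separation of rotation and magnification can be checked numerically with a minimal sketch that uses only the complex logarithm from the Python standard library. The function name `log_polar` and the sample points are illustrative assumptions.

```python
import cmath
import math

def log_polar(x, y):
    """Map a Cartesian point (not the origin) to (u, v) = (ln r, theta)."""
    w = cmath.log(complex(x, y))
    return w.real, w.imag

u0, v0 = log_polar(3.0, 4.0)     # reference point, r = 5
u1, v1 = log_polar(6.0, 8.0)     # same direction, magnified by 2
u2, v2 = log_polar(-4.0, 3.0)    # same radius, rotated by 90 degrees

shift_u = u1 - u0                # magnification appears as a shift in u only
shift_v = v2 - v0                # rotation appears as a shift in v only
```

Here `shift_u` equals ln 2 with `v` unchanged, and `shift_v` equals π/2 with `u` unchanged. Since `v` is an angle, a shift in `v` wraps around at ±π.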

The Retina layout in Figure 10 implements this
mapping by sampling in (r, θ) at points matching a
uniform (u,v) grid. The sensor clearly has rotational
symmetry and exponentially decreasing resolution. The
circular section contains only 1920 pixels (30 circles of 64
pixels/circle); at the center is a dense rectangular grid of
102 additional photosites [26]. The cells grow fast: the
outermost circle is over ten times as wide as the
innermost. This leads directly to the small pixel count.
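
The pixel count and the growth of the rings can be worked through in a short sketch. The ring and pixel counts come from the text. The choice of equal steps in u and v (square cells in the (u,v) grid) and the unit innermost radius are assumptions made only for illustration, since the actual chip geometry is not given here.

```python
import math

RINGS, PER_RING, FOVEA = 30, 64, 102     # counts stated in the text

dv = 2 * math.pi / PER_RING              # uniform angular step around each circle
du = dv                                  # ASSUMPTION: square cells in the (u, v) grid
r0 = 1.0                                 # ASSUMPTION: innermost radius, arbitrary units

# Uniform steps in u = ln(r) give radii that grow geometrically.
radii = [r0 * math.exp(du * k) for k in range(RINGS)]

periphery_pixels = RINGS * PER_RING
total_pixels = periphery_pixels + FOVEA
growth = radii[-1] / radii[0]            # outermost vs. innermost ring radius
```

With these assumed steps the outermost ring radius, and hence the cell width, is roughly 17 times the innermost, which is consistent with the "over ten times" figure quoted above but should not be read as the chip's actual geometry.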

The chip, with its custom driving electronics, is now
working at the GRASP laboratory [27] and is producing
good pictures as shown in Figure 11.

Clearly  visible  in  the  data  space  is  the  large  magni¬ 
fication  of  the  inner  circles.  The  outer  section  provides 
much  poorer  data,  with  pixels  widely  spaced  and  aver¬ 
aging  the  incident  light  over  a  larger  area.  Still  they  do 
not  provide  useless  information. 

The nature of the information has changed, however.
No longer do we get high quality data across the focal
plane. Indeed, we assume from the start that we do not
try to build a model of the world in one step. Instead,
we use the periphery to guide our attention, that is, where
we point the camera. Implicit here is the idea of an active
observer. The Retina, just sitting on a bench waiting
for an object to enter its high-resolution spot, is useless.
We must actively build the world by moving the camera,
using the periphery to suggest candidates for attention.

The cost of using this sensor might be considered high.
The new data space will require rewriting or adapting
all our tools for the cartesian plane: this is the primary
cost outside the hardware development. The advantages,
however, suggest profit. The Retina has some one hundred
times fewer pixels than a standard television camera,
which drastically reduces the computational burden
of analysis, bringing it within the abilities of modern
machines. The gains also include the rich mathematical
structure of the mapping. That structure simplifies
pattern matching by making rotations and magnifications
linear shifts in the data space, and speeds
time-to-impact measurements by looking only at a radial flow.
Some distortions introduced by the mapping, such as
translational variance (linear translations becoming
curves in the data space), also disappear in an
active observer, where for example attention and tracking
automatically compensate for linear motion.

Figure 10: The Retina CCD Imager

Figure 11: Picture of a mouse from the camera, centered
between the buttons (to the left) and ball. The picture on
the left is in the mapped plane: the vertical axis is v (v,
the angle of the point, increases moving down the axis) and
the horizontal is u (u, the log of the radial distance of
the point, increases to the right). The triangle at the upper
left of the image is data unmapped back onto a cartesian grid.

Since  the  sensor  began  working  this  summer,  our  focus 
at  the  GRASP  laboratory  has  been  redeveloping  tradi¬ 
tional  image  processing  tools.  Our  work  has  looked  at 
edge detection in the new data space, detecting lines
using a Hough algorithm, calculating the centroid of an
object, and measuring time-to-impact. Each of these areas
requires an analysis of its mathematical basis under
the log mapping and coding the results on real images.
All  algorithms  must  further  be  computationally  simple 
to  work  in  a  real-time  environment. 
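
As one example of such an analysis, time-to-impact reduces to a single
division under the log mapping. Assuming a pinhole camera with focal
length f approaching a static point along the optical axis, the image
radius is r = fX/Z, so u = log r = log(fX) - log Z and
du/dt = -(dZ/dt)/Z = 1/tau, where tau is the time to impact. The sketch
below illustrates this under those assumptions; the names and numbers are
hypothetical and it is not the implementation used on the Retina.

```python
import math

def log_radius(X, Z, f=1.0):
    """u = log r for a point at lateral offset X > 0 and depth Z (pinhole model)."""
    return math.log(f * X / Z)

def time_to_impact(u_prev, u_next, dt):
    """Estimate tau = 1 / (du/dt) from two samples of u taken dt apart."""
    return dt / (u_next - u_prev)

# Hypothetical scene: depth 10, approach speed 1, frame interval 0.01,
# so the true time to impact at the first frame is Z0 / V = 10.
Z0, V, dt = 10.0, 1.0, 0.01

estimates = []
for X in (0.2, 1.0, 3.0):          # points at different lateral offsets
    u0 = log_radius(X, Z0)
    u1 = log_radius(X, Z0 - V * dt)
    estimates.append(time_to_impact(u0, u1, dt))

for tau in estimates:
    print(f"estimated time to impact: {tau:.3f}")
```

The estimate depends only on the shift along u, not on where the point
lies in the image, which is why only the radial flow has to be examined.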

This  integration  of  sensor  and  computer  is  now  the 
fundamental  area  of  research  involving  the  Retina  at 
Penn.  That  the  Retina  works  proves  the  concept  of 
the  hardware,  of  designing  custom  imaging  sensors  for 
robots.  The  integration  itself  will  prove  the  concept  of 
the  system.  The  Retina  is  the  basic  building  block  for  a 
real-time  interactive  observer. 

7  Conclusions  and  future  plans 

The  development  of  an  Active  Observer  is  underway 
at  the  GRASP  laboratory.  Although  future  emphasis 
will  be  placed  on  the  control  structure  of  such  an  ob¬ 
server,  its  integration  policies,  and  communication  issues 
with  other  observers  and  agents  in  general,  there  is  still 
a  need  for  further  studies,  developments  and  improve¬ 
ments  of  component  technologies.  For  example,  in  the 
case  of  understanding  surface  reflectance,  we  still  have 
not  completed  the  theoretical  underpinning  of  trans¬ 
parency.  With  the  problem  of  segmentation,  while  the 
cooperation  between  surface  and  volumetric  fittings  is 
necessary,  and  they  help  in  resolving  ambiguities,  the 
first  and  second  order  primitives  are  clearly  not  sufficient 
for  modeling  a  broad  class  of  real  life  objects.  Higher 
order  models  will  have  to  be  invoked,  but  only  selec¬ 
tively  and  locally  after  the  lower  order  fits  have  failed.  If 
this  order  of  fitting  data  is  violated  then  instabilities  in 
the  fitting  procedures  can  be  expected.  Finally,  there  is 
the  question  of  the  control  mechanism  of  the  Active  Ob¬ 
server.  As  shown  above,  we  have  employed  the  Discrete 
Event Dynamic System (DEDS) model. DEDS is a suitable
formalism to model continuous processes of observation, as
well as events occurring in discrete intervals. As a result,
this model allows us to predict the observation
capability as defined by the control theory community. The
assumption here, however, is that the task of observation
is known a priori in terms of the discrete events. While in
the original theory the transitions from one state/event
to another were discrete, we have extended the theory
to transitions with uncertainties. The next task should
be to loosen the requirements for explicit knowledge of
the desired observable events. These events should be
able to be generated from some rules of physics,
geometry and other conventions of the object's and agent's
interactions. In conclusion, we are on our way to
completing an Active Observer which has a control structure
that allows us to predict observation capabilities. The
components  developed  here  allow  the  Active  Observer  to 
handle  moderately  complex  scenes  of  shapes/materials, 
their  spatial  arrangements  and  their  illuminations.  The 
real  time  issue  of  processing  is  a  crucial  one  and  hence 
our  efforts  in  special  purpose  CCD  chips  and  related 
hardware.  The  open  questions  are  many  but  we  wish 
to  concentrate  on  the  intercommunication  of  several  ob¬ 
servers  and  other  agents,  such  as  manipulatory,  mobile 
and  human  agents.  Ultimately,  the  final  issue  is  this: 
who  tells  what  and  how  much,  and  to  whom. 

References 

[1] J. Aloimonos and A. Bandyopadhyay. Active vision.
In Proc. 1st Int. Conf. on Computer Vision,
pages 35-54, 1987.

[2] R. Bajcsy. Active perception. Proceedings of the
IEEE, 76(8):996-1005, 1988.

[3] R. Bajcsy, S.W. Lee, and A. Leonardis. Color image
segmentation with detection of highlights and
local illumination induced by inter-reflections. In
Proc. 10th International Conf. on Pattern Recognition,
Atlantic City, NJ, June 1990.

[4]  E.  N.  Coleman  and  R.  Jain.  Obtaining  3- 
dimensional  shape  of  textured  and  specular  surface 
using  four-source  photometry.  Computer  Graphics 
and  Image  Processing,  18(4):308-328,  1982. 

[5] R. Gershon. The Use of Color in Computational Vision.
PhD thesis, Department of Computer Science,
University of Toronto, 1987.

[6]  G.H.  Healey  and  T.O.  Binford.  Using  color  for 
geometry-insensitive  segmentation.  Journal  of  the 
Optical  Society  of  America,  6,  1989. 

[7]  T.  Kanade  and  K.  Ikeuchi.  Introduction  to  the  spe¬ 
cial  issue  on  physical  modeling  in  computer  vision. 
IEEE  Trans.  PAMI,  13(7):609-610,  1991. 

[8] G.J. Klinker, S.A. Shafer, and T. Kanade. Image
segmentation and reflection analysis through color.
In Proceedings of the DARPA Image Understanding
Workshop, pages 838-853, Pittsburgh, PA, 1988.

[9] S. K. Nayar, K. Ikeuchi, and T. Kanade. Determining
shape and reflectance of hybrid surfaces by
photometric sampling. IEEE Trans. Robot. Autom.,
6(4):418-431, 1990.

[10] S.A. Shafer. Using color to separate reflection
components. COLOR Research and Application,
10(4):210-218, 1985.




[11] H.D. Tagare and R. J. deFigueiredo. Photometric
stereo for diffuse non-lambertian surface. IEEE
Trans. PAMI, 13():, 1991.

[12]  L.  B.  Wolff.  Polarization  Methods  in  Computer  Vi¬ 
sion.  PhD  thesis,  Department  of  Computer  Science, 
Columbia  University,  1991. 

[13] Gupta, Alok, Surface and Volumetric Segmentation
of Complex 3-D Objects Using Parametric Shape
Models, Technical Report MS-CIS-91-45, Department
of Computer and Information Science,
University of Pennsylvania, 1991.

[14]  Gupta,  Alok  and  Ruzena  Bajcsy,  Part  description 
and  segmentation  using  contour,  surface  and  volu¬ 
metric  primitives,  in  Proceedings  of  the  Conference 
on  Sensing  and  Reconstruction  of  3D  Objects  and 
Scenes,  pp.  203-214,  SPIE,  Santa  Clara,  CA,  Feb 
1990. 

[15]  Gupta,  Alok,  Luca  Bogoni,  and  Ruzena  Bajcsy, 
Quantitative  and  qualitative  measures  for  the  eval¬ 
uation  of  the  superquadric  models,  in  Proceedings  of 
the  IEEE  Workshop  on  Interpretation  of  3D  Scenes, 
pp.  162-169,  Austin,  TX,  November  1989. 

[16]  Leonardis,  Ales,  Alok  Gupta,  and  Ruzena  Bajcsy, 
Segmentation  as  the  search  for  the  best  description 
of  the  image  in  terms  of  primitives,  in  Proceedings 
of  the  Third  International  Conference  on  Computer 
Vision,  pp.  121-125.  IEEE,  Osaka,  Japan,  Decem¬ 
ber  1990a. 

[17] Leonardis, Ales, Alok Gupta, and Ruzena Bajcsy,
Segmentation of Range Images as the Search for
the Best Description of the Scene in Terms of
Geometric Primitives, Technical Report MS-CIS-90-30,
CIS Department, University of Pennsylvania,
1990b.

[18] Solina, Franc, Shape Recovery and Segmentation
with Deformable Part Models, PhD thesis,
University of Pennsylvania, 1987, Technical Report
MS-CIS-87-111.

[19] Solina, F. and R. Bajcsy, Recovery of parametric
models from range images: the case for
superquadrics with global deformations, IEEE Trans.
on Pattern Analysis and Machine Intelligence, 12,
131-147, February 1990.

[20] R. Bajcsy and T. M. Sobh, A Framework for
Observing a Manipulation Process. Technical Report
MS-CIS-90-34 and GRASP Lab. TR 216, Computer
Science Dept., School of Engineering and Applied
Science, University of Pennsylvania, June 1990.

[21] T. M. Sobh, A Framework for Visual Observation.
Technical Report MS-CIS-91-36 and GRASP Lab.
TR 261, Computer Science Dept., School of
Engineering and Applied Science, University of
Pennsylvania, May 1991.

[22] T. M. Sobh and R. Bajcsy, “Visual Observation of
A Moving Agent”. In Proceedings of the European
Robotics and Intelligent Systems Conference
(EURISCON '91), Corfu, Greece, June 1991 and
presented at the 12th International Joint Conference
on Artificial Intelligence (IJCAI), Workshop on
Dynamic Scene Understanding, Sydney, Australia,
August 1991.

[23] T. M. Sobh and R. Bajcsy, “A Model for Observing
a Moving Agent”. In Proceedings of the Fourth
International Workshop on Intelligent Robots and
Systems (IROS '91), Osaka, Japan, November 1991.

[24]  J.  Van  der  Spiegel,  G.  Kreider,  et  al.  “A  Foveated 
Retina-Like  Sensor  Using  CCD  Technology”.  In 
Analog  VLSI  Implementations  of  Neural  Systems, 
ed.  C.  Mead  and  M.  Ismail,  pp.  189-211,  Kluwer 
Academic  Publishers,  Boston,  1989. 

[25] G. Kreider, J. Van der Spiegel et al. “The Design
and Characterization of a Space Variant CCD
Sensor”. SPIE Vol. 1381 Intelligent Robots and
Computer Vision IX: Algorithms and Techniques,
Boston, November 1990.

[26] G. Kreider, J. Van der Spiegel et al. “A Retina-Like
Space Variant CCD Sensor”. SPIE Vol. 1242 Charge
Coupled Devices and Solid State Optical Sensors,
pp. 133-140, Santa Clara, February 1990.

[27]  Z.  Kalayjian.  “A  Driver  Circuit  for  the  Foveated 
Retina-Like  Optical  Sensor”.  Final  Report,  Under¬ 
graduate  Fellowship  in  Sensor  Technologies,  Univer¬ 
sity  of  Pennsylvania,  1990. 


OFFICE  OF  THE  UNDER  SECRETARY  OF  DEFENSE  (ACQUISITION) 
DEFENSE  TECHNICAL  INFORMATION  CENTER 
CAMERON  STATION 
ALEXANDRIA,  VIRGINIA  22304-6145 


IN REPLY
REFER TO

DTIC-OCC

SUBJECT: Distribution Statements on Technical Documents


TO: 


OFFICE OF NAVAL RESEARCH
CORPORATE PROGRAMS DIVISION
ONR 353
800 NORTH QUINCY STREET
ARLINGTON,  VA  22217-5600 


1. Reference: DoD Directive 5230.24, Distribution Statements on Technical Documents,
18 Mar 87.

2. The Defense Technical Information Center received the enclosed report (referenced
below) which is not marked in accordance with the above reference.

FINAL  REPORT 
N00014-88-K-0630 

TITLE:  SEGMENTATION  OF  SCENES 
IN  EXPLORATORY  MODE 


3. We request the appropriate distribution statement be assigned and the report returned
to DTIC within 5 working days.

4. Approved distribution statements are listed on the reverse of this letter. If you have any
questions regarding these statements, call DTIC's Cataloging Branch, (703) 274-6637.


FOR  THE  ADMINISTRATOR: 


GOPALAKRISHNAN  NAIR 
Chief,  Cataloging  Branch 


FL-171 
Jul  93 


DISTRIBUTION  STATEMENT  A: 


APPROVED  FOR  PUBLIC  RELEASE:  DISTRIBUTION  IS  UNLIMITED 
DISTRIBUTION  STATEMENT  B: 

DISTRIBUTION AUTHORIZED TO U.S. GOVERNMENT AGENCIES ONLY;
(Indicate Reason and Date Below). OTHER REQUESTS FOR THIS DOCUMENT SHALL BE REFERRED
TO (Indicate Controlling DoD Office Below).

DISTRIBUTION STATEMENT C:

DISTRIBUTION AUTHORIZED TO U.S. GOVERNMENT AGENCIES AND THEIR CONTRACTORS;
(Indicate Reason and Date Below). OTHER REQUESTS FOR THIS DOCUMENT SHALL BE REFERRED
TO (Indicate Controlling DoD Office Below).

DISTRIBUTION STATEMENT D:

DISTRIBUTION AUTHORIZED TO DOD AND U.S. DOD CONTRACTORS ONLY; (Indicate Reason
and Date Below). OTHER REQUESTS SHALL BE REFERRED TO (Indicate Controlling DoD Office Below).

DISTRIBUTION STATEMENT E:

DISTRIBUTION AUTHORIZED TO DOD COMPONENTS ONLY; (Indicate Reason and Date Below).
OTHER REQUESTS SHALL BE REFERRED TO (Indicate Controlling DoD Office Below).

DISTRIBUTION STATEMENT F:

FURTHER DISSEMINATION ONLY AS DIRECTED BY (Indicate Controlling DoD Office and Date
Below) or HIGHER DOD AUTHORITY.

DISTRIBUTION STATEMENT X:

DISTRIBUTION AUTHORIZED TO U.S. GOVERNMENT AGENCIES AND PRIVATE INDIVIDUALS
OR ENTERPRISES ELIGIBLE TO OBTAIN EXPORT-CONTROLLED TECHNICAL DATA IN ACCORDANCE
WITH DOD DIRECTIVE 5230.25, WITHHOLDING OF UNCLASSIFIED TECHNICAL DATA FROM PUBLIC
DISCLOSURE, 6 Nov 1984 (Indicate date of determination). CONTROLLING DOD OFFICE IS (Indicate
Controlling DoD Office).

The cited document has been reviewed by competent authority and the following distribution statement is
hereby authorized.


(Statement)

OFFICE OF NAVAL RESEARCH
PROGRAMS DIVISION
800 NORTH QUINCY STREET
ARLINGTON, VA 22217-5660

(Reason)

HUGHES
Director

(Signature & Typed Name)

(Assigning Office)

(Controlling DoD Office Name)

(Controlling DoD Office Address,
City, State, Zip)

(Date Statement Assigned)


